Hybrid 2D–CMOS microchips for memristive applications

Exploiting the excellent electronic properties of two-dimensional (2D) materials to fabricate advanced electronic circuits is a major goal for the semiconductor industry1,2. However, most studies in this field have been limited to the fabrication and characterization of isolated large (more than 1 µm2) devices on unfunctional SiO2–Si substrates. Some studies have integrated monolayer graphene on silicon microchips as a large-area (more than 500 µm2) interconnection3 and as a channel of large transistors (roughly 16.5 µm2) (refs. 4,5), but in all cases the integration density was low, no computation was demonstrated and manipulating monolayer 2D materials was challenging because native pinholes and cracks during transfer increase variability and reduce yield. Here, we present the fabrication of high-integration-density 2D–CMOS hybrid microchips for memristive applications—CMOS stands for complementary metal–oxide–semiconductor. We transfer a sheet of multilayer hexagonal boron nitride onto the back-end-of-line interconnections of silicon microchips containing CMOS transistors of the 180 nm node, and finalize the circuits by patterning the top electrodes and interconnections. The CMOS transistors provide outstanding control over the currents across the hexagonal boron nitride memristors, which allows us to achieve endurances of roughly 5 million cycles in memristors as small as 0.053 µm2. We demonstrate in-memory computation by constructing logic gates, and measure spike-timing dependent plasticity signals that are suitable for the implementation of spiking neural networks. The high performance and the relatively-high technology readiness level achieved represent a notable advance towards the integration of 2D materials in microelectronic products and memristive applications.

, in which multiple tungsten vias can be observed. The size of the image is 10 µm × 10 µm. This region corresponds to that of a single memristor (center) surrounded by many disconnected vias (often referred to as "dummy" vias). b, Cross-sectional view of a metallic via shown in the center of panel a.
Supplementary Figure 4 | Wafer processing for cross-sectional TEM inspection. SEM images of the wafer at different stages of the focused ion beam process that exposed the transistors and vias. The transistors only appeared at the end of the process, i.e., when the lamellas were very thin, due to their small size.

Supplementary Note 1: switching mechanism of the 1T1M cells
Previous studies on h-BN-based memristors have reported filamentary RS [1][2][3][4][5][6][7][8][9][10]. None of those studies used a series transistor. By combining multiple experiments, we conclude that the non-volatile bipolar RS observed in the 1T1M cells with ~0.053 µm² memristors (Figure 2c of the main text) is non-filamentary, and related to ionic exchange at the interfaces between the h-BN and the electrodes when the h-BN is softly degraded (something that can only be done with the series CMOS transistor).
First, if we repeat the fabrication process without transferring h-BN, the resulting Au/Ti/W, Au/W and Ag/W nanojunctions (with and without series CMOS transistor) do not exhibit RS. The Au/W and Ag/W nanojunctions exhibit ohmic conduction, and the Au/Ti/W nanojunctions exhibit slightly lower current due to the higher resistance of the Ti layer (which could absorb some oxygen, as shown in Extended Data Figure 3). In fact, atomic-layer-deposited 2-nm-thick TiO2 also exhibits non-filamentary non-volatile bipolar RS (Supplementary Figure 7). However, atomic-layer-deposited 2-nm-thick TiO2 is much more insulating than our O-rich Ti interfacial film, which is not responsible for the switching in our 1T1M cells because, as mentioned: i) devices with Au/Ti electrodes and without h-BN do not switch, and ii) devices without Au/Ti electrodes and with h-BN switch well (see Figure 2g of the main text). Hence, the non-volatile bipolar RS is related to the h-BN stack.
Second, our h-BN/CMOS-based 1T1M cells biased at VG = 1.1 V exhibit: i) very high RLRS, ii) non-linear currents in both states, and iii) very progressive state transitions. These are strong indications, and widely accepted evidence, of non-filamentary RS, as explained in multiple seminal articles [11][12]. Third, when VG is increased to 1.5 V the currents in LRS become linear, as shown in Supplementary Figure 8; this further confirms that the RS at VG = 1.1 V is non-filamentary. Note that this behaviour is not stable, and erratic transitions between filamentary and non-filamentary RS are detected.
And fourth, we have also conducted cross-sectional TEM coupled with EELS and EDX on our ~0.053 µm² memristors (see Supplementary Figure 9a). We do see a partial degradation of the device, i.e., some black particles partially penetrate into the insulating h-BN. However, these particles are very small compared to the filaments observed in 5 µm² devices (see Supplementary Figure 9b), and they appear to be discontinuous. Moreover, we do not detect metal penetration via EELS/EDX. For all these reasons, we cannot claim that the RS is filamentary. In our opinion, this is non-filamentary switching produced by the partial degradation of the h-BN, consistent with the activation process detected (see blue lines in Figure 2c and Extended Data Figure 4).

Additional consideration
In memristors made of amorphous metal-oxides, the filamentary or non-filamentary nature of the RS can also be deduced by measuring devices with different areas. In non-filamentary devices the currents driven in LRS and HRS depend on the device size, but in filamentary devices the currents driven in LRS do not depend on the size [11][12]. However, in polycrystalline metal-oxides this experiment may not be conclusive because, when the size of the memristor is smaller than the grain size (i.e., the device does not contain any grain boundary), a sharp change in the electrical properties could be observed.
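The area-scaling test described above can be sketched as a log-log fit of current against device area: a slope near 1 suggests area-proportional (non-filamentary) conduction, while a slope near 0 suggests a localized filament. This is only an illustrative sketch. The three areas are the device sizes quoted in this note, but the current values are hypothetical placeholders rather than measured data, and the helper name `area_scaling_exponent` is introduced here for illustration.

```python
import math

def area_scaling_exponent(areas_um2, currents_a):
    """Least-squares slope of log(I) versus log(A)."""
    xs = [math.log(a) for a in areas_um2]
    ys = [math.log(i) for i in currents_a]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Device areas quoted in the text (in square micrometres).
areas = [0.053, 0.5625, 5.0]

# HYPOTHETICAL currents, used only to exercise the function.
area_proportional = [1e-6 * a for a in areas]   # current scales with area
area_independent = [1e-4 for _ in areas]        # current set by one filament

slope_proportional = area_scaling_exponent(areas, area_proportional)
slope_independent = area_scaling_exponent(areas, area_independent)
print(f"area-proportional case: slope = {slope_proportional:.2f}")
print(f"area-independent case:  slope = {slope_independent:.2f}")
```

As the text cautions, such a fit is only meaningful when all devices share the same microstructure, which may not hold for polycrystalline films.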
In our case, the CVD h-BN is polycrystalline, i.e., it includes 2D layered regions that are electrically very insulating and clusters of defects that are more conducting (these clusters of defects may be related to grain boundaries or just lattice distortions that propagate from one layer to another [7][8]). This means that the electrical properties of a large device (>10 µm²) including one or a few clusters of defects could be completely different from those of a small device (<0.1 µm²) made entirely of 2D layered h-BN. To confirm this hypothesis, we have fabricated h-BN memristors with different sizes: ~0.053 µm², 0.5625 µm², and 5 µm². What we observe is: i) The 0.5625 µm² and 5 µm² devices show clear dielectric breakdown followed by filamentary non-volatile bipolar RS for tens or hundreds of cycles. The filamentary nature of the RS in these devices is confirmed by the sharp set/reset transitions, the very low value of RLRS, and the linear currents detected in LRS. Before the first breakdown, no stable RS is seen. And ii) the ~0.053 µm² devices exhibit stable non-volatile RS with very high RLRS, non-linear currents, and very progressive state transitions, as shown in the manuscript. As mentioned, these are strong indications, and widely accepted evidence, of non-filamentary RS 1-5.
, while the small devices show partial degradation, with the displacement of just a few small particles. These particles do not seem to form an effective filament; instead, the path appears to be partially broken (having nanogaps). These devices do not show metal migration in the EELS/EDX maps.

Supplementary Note 2: Endurance benchmarking
When benchmarking the endurance of memristive devices, one has to be very careful with two issues:
1 - Most articles that claimed high endurance (>10⁶ cycles) studied large devices (>1 µm²). The fact that one large device exhibits one kind of RS does not mean that a small device (<0.1 µm²) made of the same materials will exhibit similar RS. This is evidenced by the fact that multiple articles have claimed high endurance up to ~10¹² cycles in large devices (many of them even without using a series transistor to limit the current overshoot), but companies like Fujitsu and Intel/Micron qualify their commercial (small) metal-oxide and phase-change memories for only 0.5 million and 10 million cycles (respectively), despite using professional testing vehicles and selector elements [14][15].
It is widely known that in most memristive devices the forming voltage increases markedly (statistically) in smaller devices 9,[16][17][18][19], due to the lower probability of finding native defects that trigger the dielectric breakdown (see Supplementary Figure 10a-d). Materials with a lower density of defects tend to show poor or no RS. For example, SiO2 produced by thermal oxidation and mechanically exfoliated h-BN (very low density of native defects) do not show stable RS, while SiO2 produced by sputtering and CVD-grown h-BN (high density of native defects) exhibit stable RS 8,20. In general, higher breakdown voltages in materials with fewer native defects will produce wider filaments (see Supplementary Figure 10e-h); moreover, the higher electrical field will contribute to generating electromigration and self-accelerated avalanche currents, possibly resulting in filaments with different chemical composition (see the orange and red balls in the filaments of Supplementary Figure 10h). Larger filaments are good for enhancing the state retention time, but at the same time they reduce the switching endurance because they tend to get stuck in LRS (this is an inherent trade-off in many memristive devices 21).
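The statistical argument above (smaller devices are less likely to contain a native defect) can be illustrated with a simple calculation. This sketch assumes, purely for illustration, that native defects are distributed as a spatial Poisson process with a uniform areal density; neither this assumption nor the density value comes from the study. Only the three device areas are taken from the text.

```python
import math

def prob_at_least_one_defect(area_um2, defect_density_per_um2):
    """P(N >= 1) for a Poisson-distributed defect count with mean D * A."""
    return 1.0 - math.exp(-defect_density_per_um2 * area_um2)

# HYPOTHETICAL defect density (defects per square micrometre), for illustration only.
DEFECT_DENSITY = 1.0

# Device areas quoted in the text (in square micrometres).
device_areas = [0.053, 0.5625, 5.0]

probabilities = [prob_at_least_one_defect(a, DEFECT_DENSITY) for a in device_areas]
for area, p in zip(device_areas, probabilities):
    print(f"area = {area} um^2 -> P(at least one defect) = {p:.3f}")
```

Under these assumptions the probability grows monotonically with area, which is consistent with the qualitative trend described in the text, namely that smaller devices statistically need a higher forming voltage.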
Hence, when benchmarking the endurance of a memristor, the device size is very important. It is misleading to measure an endurance of 10 billion cycles in a memristor with a size of >10 µm² and claim that it could be used as electronic memory, because electronic memories require high integration density and high endurance simultaneously. Such endurance must be confirmed in smaller devices with an integration density suitable for memory applications (<0.1 µm²). Supplementary Figure 10i-j shows the concept behind our devices: the high endurance comes from the fact that no complete dielectric breakdown is triggered.
2 - Many plots used to claim high endurance present very few data points, i.e., they read RLRS and RHRS just once or a few times per decade for only one device (see Supplementary Figure 11a). This type of claim is extremely weak. We note that multiple claims of high endurance in novel nanomaterials have never been reproduced by other groups. Recently, a group of experienced researchers in the field of RS published an article exposing this bad practice 22, i.e., explaining why the characterization method shown in Supplementary Figure 11a is unreliable, and urging the community to employ a reliable method, which consists of measuring RLRS and RHRS in each cycle for a few devices (see Supplementary Figure 11b).
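To give a sense of how sparse a per-decade readout is, the following sketch counts the reads taken by a log-spaced protocol and compares them with one read per cycle. The cycle count of 5 million is the endurance quoted in the abstract; the sampling function itself is a hypothetical illustration and not a protocol taken from any of the cited works.

```python
def decade_sampled_cycles(total_cycles, reads_per_decade=1):
    """Cycle indices at which a sparse, log-spaced protocol would read the device."""
    cycles = set()
    k = 0
    while True:
        cycle = round(10 ** (k / reads_per_decade))
        if cycle > total_cycles:
            break
        cycles.add(cycle)
        k += 1
    return sorted(cycles)

TOTAL_CYCLES = 5_000_000  # roughly the endurance quoted in the abstract

sparse_reads = decade_sampled_cycles(TOTAL_CYCLES, reads_per_decade=1)
per_cycle_reads = TOTAL_CYCLES  # one RLRS/RHRS readout in every cycle
coverage = len(sparse_reads) / per_cycle_reads

print(f"sparse protocol reads at cycles: {sparse_reads}")
print(f"fraction of cycles actually observed: {coverage:.2e}")
```

With one read per decade, only a handful of cycles are observed, so any failure or drift between those reads would go unnoticed.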
For these reasons, when benchmarking the endurance of our ~0.053 µm² h-BN memristors, we only compare with articles that report high endurance in memristors with sizes <1 µm² and that present an abundant number of data points for RHRS and RLRS. We selected 1 µm² as the threshold because this device size starts to be attractive for on-chip memory applications. Note that, due to the limited resolution of the plots, we cannot be 100% sure whether an endurance plot in an article reads RHRS and RLRS in every cycle, unless the authors explicitly mention it in the text (we fully support the data availability policies of some publishers). Therefore, we select those studies in which the data points are so abundant that they cannot be counted (i.e., individual data points are indistinguishable).
We conducted an in-depth literature review and found only 4 studies that demonstrated high endurance in small (<1 µm²) memristors while presenting abundant data on RHRS and RLRS, as shown in Supplementary Table 1. In conclusion, the ~0.053 µm² h-BN memristive devices presented in our study achieved a high endurance, which is competitive with that of memristors made of other, much more mature materials, such as metal-oxides, phase-change materials and amorphous silicon.

Supplementary Table 1 | Studies that report high endurance in small memristors with sizes below 1 µm².
These studies used the correct characterization method (one data point per cycle, which is the only one accepted by most of the experts in this field, see reference 22). Supplementary Tables 2-3 show all the studies claiming high endurance that we discarded, either because they employed a large device size (>1 µm²) or because they presented very few data points in their endurance plots (or both).

Supplementary Note 3: Spike-timing-dependent plasticity for spiking neural networks
1T1M cells with Au/Ti/h-BN/W memristors exhibit spike-timing-dependent plasticity (STDP) when pulsed voltage stresses (PVS) displaced in time (Δt) are applied to the top (at time tpost) and bottom (at time tpre) electrodes of the memristors. The device conductance variation (ΔG as a function of Δt, with Δt = tpost − tpre) after each pair of pulses was measured with respect to the initial conductance (GINITIAL), and the resulting STDP characteristic is presented in Supplementary Figure 13, showing an exponential trend instead of a linear one as reported in previous cases 8,82. This non-volatile resistive switching performance is very attractive for constructing electronic synapses for spiking neural networks (SNNs) 83, which have attracted great interest because they consume less energy than traditional deep neural networks 84. Based on the measured STDP characteristic (which we fitted with a piecewise exponentially decaying model, as shown in Supplementary Figure 14) we emulated the performance of a memristive SNN for classifying 85 the images of the MNIST dataset.
The SNN architecture 85 is described in Figure 3a. This network was developed using Brian2 87, an SNN simulator written in Python. Our SNN comprises 784 input neurons (one for each pixel in the dataset images), 400 excitatory neurons and 400 inhibitory neurons (the SNN structure is therefore 784×400×400). The selected architecture allows lateral inhibition, which leads the excitatory neurons to compete 85. Neurons are modelled with the leaky integrate-and-fire (I&F) model 85,88 and, to prevent any of them from dominating the response pattern, we employ an adaptive membrane threshold 89, meaning that the firing threshold is increased every time the neuron fires and otherwise decays exponentially 90. During training, the synaptic weight update due to a stimulation protocol with pairs of pre- and postsynaptic spikes can be calculated based on the STDP function 91,92.
In this work, the online version of the STDP learning rule is employed in order to improve the simulation efficiency 93 . Further details regarding the STDP can be found in Ref. 82.
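As a rough illustration of the learning rule, the sketch below evaluates a piecewise exponential STDP function using the fitting parameters quoted in the caption of Supplementary Figure 13 (A+ = 0.21×10⁻⁶, A− = −0.03×10⁻⁶, τ+ = 0.35 ms, τ− = 0.5 ms). The exact functional form of the fitted model is shown in a figure inset that is not reproduced here, so the form below is an assumption. The trace-based class is one common way of writing an "online" STDP rule; it is not confirmed to be the implementation used in this work, and the class and function names are hypothetical.

```python
import math

# Fitting parameters quoted in the caption of Supplementary Figure 13
# (units of A+ and A- as in the source, which does not state them).
A_PLUS, A_MINUS = 0.21e-6, -0.03e-6
TAU_PLUS_MS, TAU_MINUS_MS = 0.35, 0.5

def delta_g_pair(dt_ms):
    """ASSUMED piecewise exponential STDP: dt = t_post - t_pre (in ms)."""
    if dt_ms > 0:
        return A_PLUS * math.exp(-dt_ms / TAU_PLUS_MS)
    if dt_ms < 0:
        return A_MINUS * math.exp(dt_ms / TAU_MINUS_MS)
    return 0.0

class OnlineSTDP:
    """Trace-based (online) form: no spike history is stored, only two traces."""

    def __init__(self):
        self.x_pre = 0.0    # presynaptic trace, decays with tau_plus
        self.y_post = 0.0   # postsynaptic trace, decays with tau_minus
        self.t_last_ms = 0.0
        self.dg_total = 0.0

    def _decay_to(self, t_ms):
        dt = t_ms - self.t_last_ms
        self.x_pre *= math.exp(-dt / TAU_PLUS_MS)
        self.y_post *= math.exp(-dt / TAU_MINUS_MS)
        self.t_last_ms = t_ms

    def pre_spike(self, t_ms):
        self._decay_to(t_ms)
        self.dg_total += A_MINUS * self.y_post  # depression: post fired earlier
        self.x_pre += 1.0

    def post_spike(self, t_ms):
        self._decay_to(t_ms)
        self.dg_total += A_PLUS * self.x_pre    # potentiation: pre fired earlier
        self.y_post += 1.0

# A single pre-then-post pair gives the same update in both formulations.
rule = OnlineSTDP()
rule.pre_spike(1.0)
rule.post_spike(1.2)
online_update = rule.dg_total
pair_update = delta_g_pair(1.2 - 1.0)
```

For an isolated spike pair the two formulations coincide; the trace-based version is cheaper in a simulation because each spike triggers a constant-time update instead of a sum over all earlier spikes.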
To study the impact of the device variability we adopted a Monte-Carlo approach and repeated the training 50 times for each SNN, totalling 150 runs, each taking ~20 hours. This approach implies that both the STDP characteristic and the initial value of the synapses change from one Monte-Carlo run to another, resulting in different conductance maps for each run (see Supplementary Figure 14). For each Monte-Carlo run, as the training progresses, the synapses connected to each of the excitatory neurons learn the general features of a given pattern. This is shown in Figure 3b, where each slice presents the 313,600 synapses arranged in 400 groups (20×20, i.e., the numeric patterns) of 784 synapses each (28×28, i.e., the pixels that form each numeric pattern); the synapses connect the input layer to the excitatory layer (green and red spheres in Figure 3a, respectively). The red square in Figure 3b indicates a group containing the 784 synapses that connect the input neurons to the first neuron of the excitatory layer. We trained the SNN with the complete MNIST training dataset and the accuracy was evaluated every 1000 images. The behaviour of the SNN is verified by the corresponding confusion matrix (see Figure 3c) and the evolution of the averaged classification accuracy during training is presented in Figure 3d. The best average accuracy reaches ~90%, which is a very high value considering the simplicity of the SNN, its similarity to biological neural networks (i.e., the learning rule employed is STDP), and the unsupervised training protocol (see Supplementary Table 4).
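The Monte-Carlo treatment of variability can be sketched as follows. The synapse count (784 × 400 = 313,600), the 50 runs, the nominal A+ and τ+ values and the 20% variability are taken from the text and from the caption of Supplementary Figure 13. The choice of a Gaussian distribution with a 20% relative standard deviation, the clipping of non-physical values and the function names are assumptions made only for this illustration.

```python
import random

N_INPUT = 784                    # one input neuron per pixel (28 x 28)
N_EXCITATORY = 400               # excitatory neurons (20 x 20 groups)
N_SYNAPSES = N_INPUT * N_EXCITATORY

A_PLUS_NOMINAL = 0.21e-6         # from the caption of Supplementary Figure 13
TAU_PLUS_NOMINAL_MS = 0.35       # from the caption of Supplementary Figure 13
RELATIVE_SPREAD = 0.20           # "20% variability" quoted in the caption

def sample_run_parameters(rng):
    """Draw one (A+, tau+) pair; ASSUMES a Gaussian spread around the nominal fit."""
    a_plus = rng.gauss(A_PLUS_NOMINAL, RELATIVE_SPREAD * A_PLUS_NOMINAL)
    tau_plus = rng.gauss(TAU_PLUS_NOMINAL_MS, RELATIVE_SPREAD * TAU_PLUS_NOMINAL_MS)
    # Clip to keep the parameters physically meaningful (assumption).
    return max(a_plus, 0.0), max(tau_plus, 1e-6)

def monte_carlo_parameters(n_runs, seed=0):
    """One parameter pair per Monte-Carlo run, reproducible through the seed."""
    rng = random.Random(seed)
    return [sample_run_parameters(rng) for _ in range(n_runs)]

runs = monte_carlo_parameters(50)
mean_a_plus = sum(a for a, _ in runs) / len(runs)
print(f"synapses between input and excitatory layers: {N_SYNAPSES}")
print(f"mean sampled A+ over {len(runs)} runs: {mean_a_plus:.3e}")
```

Each sampled pair would then parameterize the STDP rule for one training run, alongside a fresh random initialization of the synaptic conductances.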
Aiming to implement a hardware-based SNN accelerator exploiting the capabilities of our h-BN/CMOS-based 1T1M cells, we propose the CMOS circuit 94 shown in Figure 3e for emulating the electrical response of a leaky integrate-and-fire (leaky I&F) neuron, which accounts for the adaptive firing threshold and the refractory period after firing. The entire circuit is simulated in SPICE, considering all the active devices (transistors) to be from a commercially available 180 nm CMOS process, and its response is calculated while the Au/Ti/h-BN/W memristor acting as a synapse is modelled with the quasi-static memdiode model [95][96]. The correct response of the circuit is demonstrated by the presynaptic and postsynaptic traces and the evolution of the membrane potential, presented in Figure 3f-g, respectively.
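The behaviour that the neuron circuit is meant to emulate (leaky integration, an adaptive firing threshold and a refractory period) can be illustrated with a minimal discrete-time model. This is a behavioural sketch only: it is neither the SPICE circuit nor the Brian2 model used in the study, and every parameter value below is a hypothetical, dimensionless placeholder.

```python
import math

def simulate_lif(input_drive, dt_ms=0.1, tau_m_ms=10.0, v_rest=0.0, v_reset=0.0,
                 theta0=1.0, theta_plus=0.2, tau_theta_ms=100.0, refractory_ms=2.0):
    """Leaky I&F neuron with adaptive threshold and refractory period (Euler steps)."""
    v = v_rest
    theta = 0.0              # adaptive part of the threshold
    refractory_left = 0.0
    spike_times = []
    threshold_trace = []
    for step, drive in enumerate(input_drive):
        # The adaptive threshold decays exponentially between spikes.
        theta *= math.exp(-dt_ms / tau_theta_ms)
        if refractory_left > 0.0:
            # During the refractory period the membrane is held at reset.
            refractory_left -= dt_ms
            v = v_reset
        else:
            # Leaky integration of the input drive.
            v += dt_ms / tau_m_ms * (v_rest - v + drive)
            if v >= theta0 + theta:
                spike_times.append(step * dt_ms)
                v = v_reset
                theta += theta_plus          # threshold rises after each spike
                refractory_left = refractory_ms
        threshold_trace.append(theta0 + theta)
    return spike_times, threshold_trace

# Constant drive for 200 ms (HYPOTHETICAL stimulus).
spike_times, threshold_trace = simulate_lif([2.0] * 2000)
intervals = [b - a for a, b in zip(spike_times, spike_times[1:])]
print(f"number of spikes: {len(spike_times)}")
print(f"first and last inter-spike intervals (ms): {intervals[0]:.1f}, {intervals[-1]:.1f}")
```

Under a constant drive the inter-spike interval lengthens as the threshold adapts, which is the mechanism that prevents a single neuron from dominating the response pattern.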
Supplementary Figure 13 | Fitting of the STDP characteristic. The asymmetric STDP characteristic of the CMOS/h-BN-based 1T1M cell is fitted with the STDP model presented in the inset. The fitting parameters are: A− = −0.03×10⁻⁶, A+ = 0.21×10⁻⁶, τ+ = 0.35 ms and τ− = 0.5 ms. To account for the device-to-device variability, we included a 20% variability in the previously mentioned A+ and τ+ parameters.
Note that for each run, the matrix of learned patterns changes due to i) the random initial value of the synapses and ii) the variability of the STDP characteristic. For example, in the first run the 1st excitatory neuron (upper left corner, highlighted by a red square) learns to recognize, for instance, the digit "0"; in the second run the same neuron learns to recognize the digit "2", in the third run the digit "4" and in the fourth run the digit "9". Note that, apart from different synaptic conductance maps, the device-to-device variability also produces changes in the conductance evolution (i.e., after 10,000 images, the conductances of the potentiated synapses in the map corresponding to the 4th Monte-Carlo run are noticeably lower than those from the 1st and 2nd Monte-Carlo runs). Note that, for the sake of readability, in this case we have considered the case of 100 neurons, as for the 400-neuron case the patterns are too small to be properly displayed in the figure.